Overview

Dataset statistics

Number of variables: 9
Number of observations: 79215
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.4 MiB
Average record size in memory: 72.0 B
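
The memory figures above can be cross-checked with simple arithmetic. The sketch below assumes every column is stored as a 64-bit value (8 bytes); the report does not state the dtypes, so this is an assumption that happens to match the reported 72.0 B per record.

```python
# Assumption: each of the 9 columns holds 8-byte values (e.g. int64/float64).
N_ROWS = 79215
N_COLS = 9
BYTES_PER_VALUE = 8

record_bytes = N_COLS * BYTES_PER_VALUE        # bytes per observation
total_mib = N_ROWS * record_bytes / 2**20      # whole dataset, in MiB
column_kib = N_ROWS * BYTES_PER_VALUE / 2**10  # a single column, in KiB

print(record_bytes)          # bytes per record
print(round(total_mib, 1))   # total size in MiB
print(round(column_kib, 1))  # per-column size in KiB
```

Under this assumption the per-column figure comes out slightly below the 619.0 KiB shown for each variable; the small gap is presumably index overhead.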

Variable types

Numeric: 9

Alerts

X_38 is highly correlated with X_39 and 1 other field [High correlation]
X_39 is highly correlated with X_38 and 1 other field [High correlation]
X_40 is highly correlated with X_38 and 1 other field [High correlation]
X_41 is highly correlated with X_43 [High correlation]
X_42 is highly correlated with X_44 and 1 other field [High correlation]
X_43 is highly correlated with X_41 and 1 other field [High correlation]
X_44 is highly correlated with X_42 [High correlation]
X_45 is highly correlated with X_42 and 1 other field [High correlation]
X_38 is highly correlated with X_39 and 1 other field [High correlation]
X_39 is highly correlated with X_38 [High correlation]
X_40 is highly correlated with X_38 [High correlation]
X_41 is highly correlated with X_43 [High correlation]
X_42 is highly correlated with X_44 and 1 other field [High correlation]
X_43 is highly correlated with X_41 and 1 other field [High correlation]
X_44 is highly correlated with X_42 [High correlation]
X_45 is highly correlated with X_42 and 1 other field [High correlation]
X_38 is highly correlated with X_39 and 1 other field [High correlation]
X_39 is highly correlated with X_38 and 1 other field [High correlation]
X_40 is highly correlated with X_38 and 1 other field [High correlation]
X_41 is highly correlated with X_43 [High correlation]
X_42 is highly correlated with X_44 [High correlation]
X_43 is highly correlated with X_41 and 1 other field [High correlation]
X_44 is highly correlated with X_42 [High correlation]
X_45 is highly correlated with X_43 [High correlation]
X_38 is highly correlated with X_41 and 1 other field [High correlation]
X_39 is highly correlated with X_40 [High correlation]
X_40 is highly correlated with X_39 [High correlation]
X_41 is highly correlated with X_38 and 4 other fields [High correlation]
X_42 is highly correlated with X_38 and 3 other fields [High correlation]
X_43 is highly correlated with X_41 and 2 other fields [High correlation]
X_44 is highly correlated with X_41 and 2 other fields [High correlation]
X_45 is highly correlated with X_41 and 2 other fields [High correlation]
X_38 is highly skewed (γ1 = 21.53589587) [Skewed]
df_index is uniformly distributed [Uniform]
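
The skewness alert refers to γ1, the third standardized moment. The sketch below shows, on synthetic data that is not taken from this dataset, how a single extreme value can drive γ1 far from zero. It uses the plain population formula; profiling tools may apply a bias correction, so exact values can differ slightly.

```python
from statistics import fmean


def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias correction)."""
    mean = fmean(values)
    m2 = fmean((v - mean) ** 2 for v in values)  # second central moment
    m3 = fmean((v - mean) ** 3 for v in values)  # third central moment
    return m3 / m2 ** 1.5


# Synthetic illustration only: a symmetric sample, then the same sample
# with one large positive outlier appended.
symmetric = [-1.0, 0.0, 1.0] * 100
with_outlier = symmetric + [50.0]

print(skewness(symmetric))     # close to 0 for a symmetric sample
print(skewness(with_outlier))  # strongly positive
```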

Reproduction

Analysis started: 2022-08-07 05:51:47.453350
Analysis finished: 2022-08-07 05:52:00.996080
Duration: 13.54 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

UNIFORM

Distinct: 39608
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19803.25
Minimum: 0
Maximum: 39607
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 619.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1980
Q1: 9901.5
median: 19803
Q3: 29705
95-th percentile: 37626.3
Maximum: 39607
Range: 39607
Interquartile range (IQR): 19803.5

Descriptive statistics

Standard deviation: 11433.77256
Coefficient of variation (CV): 0.5773684907
Kurtosis: -1.199999999
Mean: 19803.25
Median Absolute Deviation (MAD): 9902
Skewness: 1.6561712 × 10^-9
Sum: 1568714449
Variance: 130731155.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count   Frequency (%)
0                       2       < 0.1%
26407                   2       < 0.1%
26400                   2       < 0.1%
26401                   2       < 0.1%
26402                   2       < 0.1%
26403                   2       < 0.1%
26404                   2       < 0.1%
26405                   2       < 0.1%
26406                   2       < 0.1%
26408                   2       < 0.1%
Other values (39598)    79195   > 99.9%

Value   Count   Frequency (%)
0       2       < 0.1%
1       2       < 0.1%
2       2       < 0.1%
3       2       < 0.1%
4       2       < 0.1%
5       2       < 0.1%
6       2       < 0.1%
7       2       < 0.1%
8       2       < 0.1%
9       2       < 0.1%

Value   Count   Frequency (%)
39607   1       < 0.1%
39606   2       < 0.1%
39605   2       < 0.1%
39604   2       < 0.1%
39603   2       < 0.1%
39602   2       < 0.1%
39601   2       < 0.1%
39600   2       < 0.1%
39599   2       < 0.1%
39598   2       < 0.1%
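
The df_index figures are consistent with the integers 0 to 39606 each occurring twice and 39607 occurring once. The sketch below builds that hypothetical arrangement (an assumption inferred from the tables above, not something the report states) and compares it with the reported count, distinct count, sum, and median.

```python
from statistics import median

# Hypothetical reconstruction: 0..39606 twice, 39607 once.
values = list(range(39607)) * 2 + [39607]

n_observations = len(values)
n_distinct = len(set(values))
total = sum(values)
mean = total / n_observations
mid = median(values)

print(n_observations, n_distinct, total)
print(round(mean, 2), mid)
```

If the arrangement is right, these match the reported 79215 observations, 39608 distinct values, and sum of 1568714449.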

X_38
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 267
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -15.90735101
Minimum: -17.09
Maximum: 32.23
Zeros: 0
Zeros (%): 0.0%
Negative: 79214
Negative (%): > 99.9%
Memory size: 619.0 KiB

Quantile statistics

Minimum: -17.09
5-th percentile: -16.33
Q1: -16.16
median: -15.99
Q3: -15.75
95-th percentile: -15.23
Maximum: 32.23
Range: 49.32
Interquartile range (IQR): 0.41

Descriptive statistics

Standard deviation: 0.5320721989
Coefficient of variation (CV): -0.03344819629
Kurtosis: 1140.399844
Mean: -15.90735101
Median Absolute Deviation (MAD): 0.2
Skewness: 21.53589587
Sum: -1260100.81
Variance: 0.2831008248
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
-16.11                1499    1.9%
-16.08                1453    1.8%
-16.04                1401    1.8%
-16.02                1396    1.8%
-15.99                1392    1.8%
-16.13                1376    1.7%
-16.17                1329    1.7%
-16.19                1291    1.6%
-16.23                1204    1.5%
-16.1                 1135    1.4%
Other values (257)    65739   83.0%

Value    Count   Frequency (%)
-17.09   1       < 0.1%
-17.05   1       < 0.1%
-17.04   1       < 0.1%
-17.02   1       < 0.1%
-17.01   1       < 0.1%
-16.95   1       < 0.1%
-16.94   1       < 0.1%
-16.93   2       < 0.1%
-16.91   1       < 0.1%
-16.88   1       < 0.1%

Value    Count   Frequency (%)
32.23    1       < 0.1%
-2.65    61      0.1%
-14.1    1       < 0.1%
-14.12   1       < 0.1%
-14.2    1       < 0.1%
-14.25   1       < 0.1%
-14.26   1       < 0.1%
-14.35   1       < 0.1%
-14.38   1       < 0.1%
-14.39   1       < 0.1%
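
Several of the X_38 descriptive statistics are derived from others in the same report, which makes them easy to sanity-check. The sketch below recomputes the coefficient of variation, variance, range, and IQR from the reported mean, standard deviation, extremes, and quartiles; the variable names are illustrative.

```python
# Values copied from the X_38 section of the report.
mean = -15.90735101
std = 0.5320721989
minimum, maximum = -17.09, 32.23
q1, q3 = -16.16, -15.75

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(cv, variance, round(value_range, 2), round(iqr, 2))
```

The range of 49.32 against an IQR of 0.41 shows how far the single maximum of 32.23 sits from the bulk of the data, which is consistent with the skewness alert for this variable.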

X_39
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 264
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -15.89356599
Minimum: -17.09
Maximum: -2.65
Zeros: 0
Zeros (%): 0.0%
Negative: 79215
Negative (%): 100.0%
Memory size: 619.0 KiB

Quantile statistics

Minimum: -17.09
5-th percentile: -16.34
Q1: -16.16
median: -15.99
Q3: -15.75
95-th percentile: -15.24
Maximum: -2.65
Range: 14.44
Interquartile range (IQR): 0.41

Descriptive statistics

Standard deviation: 0.7055989191
Coefficient of variation (CV): -0.04439525525
Kurtosis: 268.278863
Mean: -15.89356599
Median Absolute Deviation (MAD): 0.19
Skewness: 14.534386
Sum: -1259008.83
Variance: 0.4978698346
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
-16.11                1511    1.9%
-16.08                1488    1.9%
-16.13                1449    1.8%
-16.02                1418    1.8%
-16.04                1385    1.7%
-16.17                1329    1.7%
-15.99                1310    1.7%
-16.19                1302    1.6%
-15.95                1242    1.6%
-16.23                1221    1.5%
Other values (254)    65560   82.8%

Value    Count   Frequency (%)
-17.09   1       < 0.1%
-17.07   1       < 0.1%
-16.99   2       < 0.1%
-16.97   2       < 0.1%
-16.93   1       < 0.1%
-16.91   2       < 0.1%
-16.89   2       < 0.1%
-16.88   3       < 0.1%
-16.86   1       < 0.1%
-16.84   5       < 0.1%

Value    Count   Frequency (%)
-2.65    173     0.2%
-14.11   1       < 0.1%
-14.15   1       < 0.1%
-14.16   1       < 0.1%
-14.21   1       < 0.1%
-14.24   1       < 0.1%
-14.26   1       < 0.1%
-14.31   1       < 0.1%
-14.33   1       < 0.1%
-14.36   1       < 0.1%

X_40
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 263
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -16.57232961
Minimum: -17.75
Maximum: -14.78
Zeros: 0
Zeros (%): 0.0%
Negative: 79215
Negative (%): 100.0%
Memory size: 619.0 KiB

Quantile statistics

Minimum: -17.75
5-th percentile: -16.99
Q1: -16.81
median: -16.64
Q3: -16.4
95-th percentile: -15.88
Maximum: -14.78
Range: 2.97
Interquartile range (IQR): 0.41

Descriptive statistics

Standard deviation: 0.3444261602
Coefficient of variation (CV): -0.02078320721
Kurtosis: 1.383041925
Mean: -16.57232961
Median Absolute Deviation (MAD): 0.2
Skewness: 1.086859227
Sum: -1312777.09
Variance: 0.1186293798
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
-16.72                1432    1.8%
-16.76                1423    1.8%
-16.7                 1408    1.8%
-16.79                1380    1.7%
-16.67                1377    1.7%
-16.81                1355    1.7%
-16.63                1316    1.7%
-16.85                1306    1.6%
-16.61                1182    1.5%
-16.88                1137    1.4%
Other values (253)    65899   83.2%

Value    Count   Frequency (%)
-17.75   1       < 0.1%
-17.72   2       < 0.1%
-17.69   1       < 0.1%
-17.62   1       < 0.1%
-17.59   1       < 0.1%
-17.58   1       < 0.1%
-17.56   2       < 0.1%
-17.55   2       < 0.1%
-17.53   2       < 0.1%
-17.52   1       < 0.1%

Value    Count   Frequency (%)
-14.78   1       < 0.1%
-14.8    1       < 0.1%
-14.83   1       < 0.1%
-14.88   1       < 0.1%
-14.97   2       < 0.1%
-15      1       < 0.1%
-15.01   1       < 0.1%
-15.05   1       < 0.1%
-15.06   1       < 0.1%
-15.07   4       < 0.1%

X_41
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 44
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.1871364
Minimum: 20.73
Maximum: 21.62
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 619.0 KiB

Quantile statistics

Minimum: 20.73
5-th percentile: 21.14
Q1: 21.17
median: 21.19
Q3: 21.21
95-th percentile: 21.24
Maximum: 21.62
Range: 0.89
Interquartile range (IQR): 0.04

Descriptive statistics

Standard deviation: 0.03086650198
Coefficient of variation (CV): 0.001456851053
Kurtosis: 3.214408381
Mean: 21.1871364
Median Absolute Deviation (MAD): 0.02
Skewness: -0.08115085437
Sum: 1678339.01
Variance: 0.0009527409446
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=44)
Value                Count   Frequency (%)
21.19                11427   14.4%
21.17                10227   12.9%
21.18                9520    12.0%
21.2                 8385    10.6%
21.21                8289    10.5%
21.16                6639    8.4%
21.15                5454    6.9%
21.22                5045    6.4%
21.23                3829    4.8%
21.14                2804    3.5%
Other values (34)    7596    9.6%

Value   Count   Frequency (%)
20.73   1       < 0.1%
20.78   1       < 0.1%
20.81   2       < 0.1%
20.85   2       < 0.1%
20.86   1       < 0.1%
20.89   1       < 0.1%
20.97   1       < 0.1%
20.98   1       < 0.1%
20.99   1       < 0.1%
21.01   2       < 0.1%

Value   Count   Frequency (%)
21.62   1       < 0.1%
21.51   1       < 0.1%
21.33   2       < 0.1%
21.32   6       < 0.1%
21.31   23      < 0.1%
21.3    43      0.1%
21.29   65      0.1%
21.28   158     0.2%
21.27   288     0.4%
21.26   652     0.8%

X_42
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 45
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.05950237
Minimum: 20.79
Maximum: 21.44
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 619.0 KiB

Quantile statistics

Minimum: 20.79
5-th percentile: 21
Q1: 21.03
median: 21.06
Q3: 21.09
95-th percentile: 21.13
Maximum: 21.44
Range: 0.65
Interquartile range (IQR): 0.06

Descriptive statistics

Standard deviation: 0.04023339753
Coefficient of variation (CV): 0.001910462879
Kurtosis: 0.4649611753
Mean: 21.05950237
Median Absolute Deviation (MAD): 0.03
Skewness: 0.05199705833
Sum: 1668228.48
Variance: 0.001618726277
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=45)
Value                Count   Frequency (%)
21.06                8419    10.6%
21.05                7717    9.7%
21.08                7441    9.4%
21.04                7207    9.1%
21.07                6645    8.4%
21.03                6368    8.0%
21.09                5170    6.5%
21.1                 5126    6.5%
21.02                4755    6.0%
21.01                4185    5.3%
Other values (35)    16182   20.4%

Value   Count   Frequency (%)
20.79   1       < 0.1%
20.81   1       < 0.1%
20.88   1       < 0.1%
20.89   10      < 0.1%
20.9    22      < 0.1%
20.91   24      < 0.1%
20.92   49      0.1%
20.93   63      0.1%
20.94   115     0.1%
20.95   176     0.2%

Value   Count   Frequency (%)
21.44   1       < 0.1%
21.31   3       < 0.1%
21.29   2       < 0.1%
21.28   4       < 0.1%
21.27   1       < 0.1%
21.25   2       < 0.1%
21.24   5       < 0.1%
21.23   8       < 0.1%
21.22   6       < 0.1%
21.21   6       < 0.1%

X_43
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 54
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.20391706
Minimum: 20.8
Maximum: 21.41
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 619.0 KiB

Quantile statistics

Minimum: 20.8
5-th percentile: 21.13
Q1: 21.17
median: 21.2
Q3: 21.24
95-th percentile: 21.28
Maximum: 21.41
Range: 0.61
Interquartile range (IQR): 0.07

Descriptive statistics

Standard deviation: 0.04727842814
Coefficient of variation (CV): 0.002229702559
Kurtosis: 0.6533989495
Mean: 21.20391706
Median Absolute Deviation (MAD): 0.03
Skewness: -0.1374009512
Sum: 1679668.29
Variance: 0.002235249768
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
21.19                7530    9.5%
21.21                7475    9.4%
21.2                 6345    8.0%
21.17                5883    7.4%
21.23                5713    7.2%
21.18                5709    7.2%
21.22                5583    7.0%
21.24                5308    6.7%
21.16                3910    4.9%
21.25                3750    4.7%
Other values (44)    22009   27.8%

Value   Count   Frequency (%)
20.8    1       < 0.1%
20.84   1       < 0.1%
20.87   1       < 0.1%
20.89   1       < 0.1%
20.9    1       < 0.1%
20.92   2       < 0.1%
20.93   1       < 0.1%
20.95   1       < 0.1%
20.96   1       < 0.1%
20.97   3       < 0.1%

Value   Count   Frequency (%)
21.41   1       < 0.1%
21.4    4       < 0.1%
21.39   7       < 0.1%
21.38   7       < 0.1%
21.37   19      < 0.1%
21.36   25      < 0.1%
21.35   65      0.1%
21.34   90      0.1%
21.33   170     0.2%
21.32   304     0.4%

X_44
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 37
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.16023266
Minimum: 20.93
Maximum: 21.32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 619.0 KiB

Quantile statistics

Minimum: 20.93
5-th percentile: 21.1
Q1: 21.13
median: 21.16
Q3: 21.19
95-th percentile: 21.22
Maximum: 21.32
Range: 0.39
Interquartile range (IQR): 0.06

Descriptive statistics

Standard deviation: 0.04214197817
Coefficient of variation (CV): 0.001991564972
Kurtosis: -0.1111087146
Mean: 21.16023266
Median Absolute Deviation (MAD): 0.03
Skewness: -0.3667056325
Sum: 1676207.83
Variance: 0.001775946324
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=37)
Value                Count   Frequency (%)
21.19                8156    10.3%
21.12                7209    9.1%
21.14                7202    9.1%
21.13                6920    8.7%
21.2                 6738    8.5%
21.21                6626    8.4%
21.15                5974    7.5%
21.18                5380    6.8%
21.17                4686    5.9%
21.11                3983    5.0%
Other values (27)    16341   20.6%

Value   Count   Frequency (%)
20.93   1       < 0.1%
20.94   1       < 0.1%
20.95   4       < 0.1%
20.96   8       < 0.1%
20.97   3       < 0.1%
20.98   8       < 0.1%
20.99   19      < 0.1%
21      30      < 0.1%
21.01   74      0.1%
21.02   119     0.2%

Value   Count   Frequency (%)
21.32   1       < 0.1%
21.28   2       < 0.1%
21.27   2       < 0.1%
21.26   19      < 0.1%
21.25   137     0.2%
21.24   654     0.8%
21.23   1951    2.5%
21.22   3610    4.6%
21.21   6626    8.4%
21.2    6738    8.5%

X_45
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 40
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.154575901
Minimum: 0
Maximum: 0.42
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 619.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.08
Q1: 0.12
median: 0.15
Q3: 0.19
95-th percentile: 0.23
Maximum: 0.42
Range: 0.42
Interquartile range (IQR): 0.07

Descriptive statistics

Standard deviation: 0.04718884696
Coefficient of variation (CV): 0.3052794559
Kurtosis: -0.353650489
Mean: 0.154575901
Median Absolute Deviation (MAD): 0.03
Skewness: 0.2462375801
Sum: 12244.73
Variance: 0.002226787277
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=40)
Value                Count   Frequency (%)
0.13                 6166    7.8%
0.12                 6049    7.6%
0.14                 5833    7.4%
0.16                 5773    7.3%
0.15                 5700    7.2%
0.11                 5465    6.9%
0.17                 5449    6.9%
0.18                 5188    6.5%
0.19                 4763    6.0%
0.1                  4462    5.6%
Other values (30)    24367   30.8%

Value   Count   Frequency (%)
0       1       < 0.1%
0.01    3       < 0.1%
0.02    10      < 0.1%
0.03    34      < 0.1%
0.04    124     0.2%
0.05    308     0.4%
0.06    646     0.8%
0.07    1206    1.5%
0.08    2086    2.6%
0.09    3259    4.1%

Value   Count   Frequency (%)
0.42    2       < 0.1%
0.39    2       < 0.1%
0.38    1       < 0.1%
0.36    3       < 0.1%
0.35    3       < 0.1%
0.34    5       < 0.1%
0.33    7       < 0.1%
0.32    10      < 0.1%
0.31    32      < 0.1%
0.3     60      0.1%

Interactions

[Figure: pairwise interaction plots for the 9 variables (81 panels), rendered with Matplotlib v3.5.2.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
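
The calculation described above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used by the profiling tool; tied values receive their average rank, and the function names are chosen here for illustration.

```python
from statistics import fmean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A nonlinear but strictly increasing relationship.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman(x, y), pearson(x, y))
```

For the cubic example, ρ is 1 up to rounding because the relationship is perfectly monotonic, while r on the raw values is below 1.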

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the slope (the angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
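
A minimal sketch of this calculation, together with a check of the location and scale invariance mentioned above, might look like this. The data are made up for illustration, and positive scale factors are assumed (a negative factor would flip the sign of r).

```python
from statistics import fmean


def pearson_r(x, y):
    """Pearson's r: covariance over the product of standard deviations."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors in covariance and standard deviations cancel out.
    return cov / (sx * sy)


# Illustrative data, not taken from the profiled dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
x_shifted = [2.0 * a + 5.0 for a in x]
y_shifted = [0.1 * b - 3.0 for b in y]
r_transformed = pearson_r(x_shifted, y_shifted)

print(r_original, r_transformed)
```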

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
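
The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch is shown below; statistical libraries often default to a tie-corrected variant (τ-b), so values may differ when the data contain ties. Here tied pairs count as neither concordant nor discordant.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Four observations with one swapped pair: 5 concordant, 1 discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau_a(x, y))
```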

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

First rows

     df_index    X_38    X_39    X_40   X_41   X_42   X_43   X_44  X_45
0           0  -16.41  -16.36  -17.03  21.20  20.99  21.28  21.09  0.29
1           1  -16.06  -16.11  -16.74  21.16  21.03  21.16  21.13  0.13
2           2  -16.16  -16.17  -16.76  21.13  21.03  21.17  21.12  0.14
3           3  -16.05  -16.03  -16.67  21.18  20.98  21.20  21.09  0.22
4           4  -16.25  -16.23  -16.85  21.16  20.96  21.18  21.10  0.22
5           5  -16.45  -16.50  -17.14  21.17  21.07  21.19  21.16  0.12
6           6  -15.60  -15.58  -16.23  21.18  20.99  21.20  21.10  0.21
7           7  -16.08  -15.99  -16.68  21.19  21.00  21.23  21.12  0.23
8           8  -15.85  -15.91  -16.54  21.17  21.07  21.17  21.16  0.10
9           9  -16.03  -16.00  -16.63  21.13  21.00  21.17  21.10  0.17

Last rows

       df_index    X_38    X_39    X_40   X_41   X_42   X_43   X_44  X_45
79205     39598  -15.54  -15.48  -16.19  21.24  21.10  21.27  21.18  0.17
79206     39599  -15.74  -15.75  -16.42  21.21  21.11  21.27  21.18  0.16
79207     39600  -16.03  -16.01  -16.64  21.18  21.03  21.24  21.10  0.21
79208     39601  -16.03  -16.08  -16.72  21.17  21.12  21.23  21.21  0.11
79209     39602  -16.04  -16.02  -16.63  21.17  21.04  21.17  21.13  0.13
79210     39603  -16.17  -16.26  -16.88  21.16  21.13  21.24  21.19  0.11
79211     39604  -16.11  -16.10  -16.73  21.16  21.03  21.22  21.12  0.19
79212     39605  -16.23  -16.32  -16.93  21.16  21.11  21.23  21.17  0.12
79213     39606  -15.99  -16.05  -16.67  21.18  21.10  21.21  21.19  0.11
79214     39607  -15.75  -15.81  -16.44  21.17  21.10  21.23  21.18  0.13